Abstract

Recently, Han discovered two formulas involving binary trees which have
the interesting property that hooklengths appear as exponents. The purpose
of this note is to give a probabilistic proof of one of Han's formulas. Yang has
generalized Han's results to ordered trees. We show how the probabilistic
approach can also be used in Yang's setting, as well as for a generalization
of Han's formula in terms of certain infinite trees.



1 Introduction and definitions 

Frame, Robinson, and Thrall [1] discovered the hook formula for standard Young
tableaux. Greene, Nijenhuis, and Wilf [2] then gave a probabilistic proof of this
result where the hook lengths appeared in a very natural way. The same trio also
used probabilistic methods to prove the sum of squares formula for the symmetric
group [3]. Sagan [7] and Sagan and Yeh [8] gave probabilistic algorithms for proving
hook formulas for shifted Young tableaux and trees, respectively.

Recently, inspired by an identity of Postnikov [6], Han [4] proved two identities
involving binary trees which have the interesting property that hooklengths appear
as exponents. (Han [5] also discovered an identity with this same property which
generalizes Postnikov's.) Han's demonstration was by algebraic manipulation of
recursions. Yang [9] generalized Han's identities to weighted ordered trees. Again,
the proofs were algebraic in nature, this time using generating functions.

Figure 1: The trees in $\mathcal{B}(3)$

The purpose of this note is to give a probabilistic proof of Han's first formula 
which is similar in some ways to the second algorithm of Greene, Nijenhuis, and 
Wilf. A weighted version of the algorithm proves the analogous identity of Yang. 
A second generalization of Han's original formula to certain infinite trees is also 
demonstrated by this method. The rest of this section is devoted to the necessary 
definitions to state the identities to be proved. Section 2 gives the probabilistic
algorithm and proofs. The final section is devoted to indicating how Han's second 
formula might be demonstrated probabilistically. 

For any tree, $T$, we denote the vertex set of $T$ by $V(T)$. If no confusion will
result, we will often write $v \in T$ and $|T|$ in place of the more cumbersome $v \in V(T)$
and $|V(T)|$, where $|\cdot|$ denotes cardinality. If $T$ is rooted and $v \in T$, then the set of
children of $v$ will be denoted $C_v$, and we let $c_v = |C_v|$. The hook of $v$, $H_v$, is the set
of descendants of $v$ (including $v$ itself) with corresponding hook length $h_v = |H_v|$.

A binary tree, $T$, is a rooted tree where every vertex has either no children, a
left child, a right child, or both children. Let $\mathcal{B}$ denote the family of all binary
trees and let

$$\mathcal{B}(n) = \{T \in \mathcal{B} : |T| = n\}.$$

For example, the trees in $\mathcal{B}(3)$ are displayed in Figure 1. In what follows, we will
use similar notation for other families of trees. The formula of Han which we will
prove is as follows.

Theorem 1.1 (Han). For each positive integer $n$ we have

$$\sum_{T \in \mathcal{B}(n)} \prod_{v \in T} \frac{1}{h_v 2^{h_v - 1}} = \frac{1}{n!}. \qquad (1)$$
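As a numerical sanity check on Theorem 1.1 (not a proof, and not the probabilistic argument of this note), the sum over binary trees of $\prod_v 1/(h_v 2^{h_v-1})$ can be computed exactly for small $n$ by splitting a tree into its root and its left and right subtrees. The function name `han_sum` is an assumption made for this sketch.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def han_sum(n):
    """Sum over binary trees T with n vertices of prod_v 1/(h_v * 2^(h_v - 1)).

    n = 0 stands for an absent child and contributes the empty product 1.
    """
    if n == 0:
        return Fraction(1)
    # The root has hook length n; its left subtree has k vertices and its
    # right subtree has n - 1 - k vertices.
    root_factor = Fraction(1, n * 2 ** (n - 1))
    return root_factor * sum(han_sum(k) * han_sum(n - 1 - k) for k in range(n))


# The identity of Theorem 1.1 for n = 1, ..., 8.
for n in range(1, 9):
    assert han_sum(n) == Fraction(1, factorial(n))
```

Exact rational arithmetic is used so that the comparison with $1/n!$ is an equality rather than a floating-point approximation.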
Figure 2: The trees in $\mathcal{O}(4)$ and their weights

Now consider finite ordered trees weighted by

$$w(T) = \prod_{v \in T} \binom{m}{c_v}$$

where $m$ is a variable. Let $\mathcal{O}$ denote the family of weighted ordered trees. The
trees in $\mathcal{O}(4)$ along with their weights are shown in Figure 2. Then the identity of
Yang we are considering is equivalent to:

Theorem 1.2 (Yang). For each positive integer $n$ we have

$$\sum_{T \in \mathcal{O}(n)} w(T) \prod_{v \in T} \frac{1}{h_v m^{h_v - 1}} = \frac{1}{n!}. \qquad (2)$$
Some comments about this result are in order. First of all, it is remarkable
because the right-hand side of the equation does not depend on $m$. Secondly, it
implies Han's formula by letting $m = 2$, since then $w(T)$ is just the number of
ways of turning an ordered tree into a binary tree. Finally, Yang's weighting was
actually

$$\prod_{v \in T} \binom{m}{c_v} s^{c_v}$$

where $s$ is another parameter. So one can recover Yang's equation by multiplying
both sides of (2) by $s^{n-1}$. Also, Yang assumes that $m$ and $s$ are constants satisfying
certain conditions, but it is clearly not necessary to do so.
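The independence of $m$ in Theorem 1.2 can be checked numerically. The sketch below assumes the weight $w(T) = \prod_v \binom{m}{c_v}$ with the generalized binomial coefficient $m(m-1)\cdots(m-c+1)/c!$, and sums $w(T)\prod_v 1/(h_v m^{h_v-1})$ over ordered trees by grouping them according to the sizes of the root's subtrees. The names `binom` and `yang_sum` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def binom(m, c):
    """Generalized binomial coefficient m(m-1)...(m-c+1)/c! for rational m."""
    out = Fraction(1)
    for i in range(c):
        out *= m - i
    return out / factorial(c)


def yang_sum(n, m):
    """Sum over ordered trees T with n vertices of w(T) * prod_v 1/(h_v m^(h_v-1))."""
    m = Fraction(m)
    S = {}  # S[k]: the weighted sum over ordered trees with k vertices
    for k in range(1, n + 1):
        # forests[c][j]: sum over ordered forests of c trees with j vertices in total
        forests = [[Fraction(0)] * k for _ in range(k)]
        forests[0][0] = Fraction(1)
        for c in range(1, k):
            for j in range(c, k):
                forests[c][j] = sum(
                    S[i] * forests[c - 1][j - i] for i in range(1, j - c + 2)
                )
        # The root has c children whose subtrees use the other k - 1 vertices.
        total = sum(binom(m, c) * forests[c][k - 1] for c in range(k))
        S[k] = total / (k * m ** (k - 1))
    return S[n]


# The value is 1/n! whatever m is, including non-integer m.
for m in (2, 3, Fraction(7, 3)):
    assert yang_sum(5, m) == Fraction(1, factorial(5))
```

Taking `m = 2` gives the value of the sum in Theorem 1.1, in line with the remark that Yang's identity implies Han's formula.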
Figure 3: The subtrees in $\mathcal{T}(3)$ for the given $\bar{T}$

For our second generalization of equation (1), consider a fixed, infinite, ordered
tree $\bar{T}$ such that $0 < \bar{c}_v < \infty$ for all $v \in \bar{T}$. We are using $\bar{c}_v$ for the number of
children of $v$ to emphasize that this is being calculated in $\bar{T}$. Let $\mathcal{T}$ be the family
of all subtrees of $\bar{T}$. Since $\bar{T}$ is ordered, its vertices are distinguishable, i.e., $V(\bar{T})$
is a set rather than a multiset. So we consider two subtrees $T, T'$ to be equal if
and only if $V(T) = V(T')$. For example, $\mathcal{T} = \mathcal{B}$ if we let $\bar{T}$ be the tree with $\bar{c}_v = 2$
for all $v \in \bar{T}$. As another illustration, Figure 3 shows part of one possible $\bar{T}$ and
the corresponding trees in $\mathcal{T}(3)$.

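Theorem 1.3. For each positive integer $n$ and each tree $\bar{T}$ satisfying the above
restrictions, we have

$$\sum_{T \in \mathcal{T}(n)} \prod_{v \in T} \frac{1}{h_v \bar{c}_v^{\,h_v - 1}} = \frac{1}{n!}. \qquad (3)$$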
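Theorem 1.3 can be tested on small cases in the same spirit. In the sketch below a vertex of the infinite tree is encoded as the tuple of child indices on the path from the root, the infinite tree is described by a function `cbar` returning the number of children of a vertex, and subtrees are assumed to contain the root. These encodings and the name `subtree_sum` are assumptions of the sketch, not notation from the paper.

```python
from fractions import Fraction
from math import factorial


def subtree_sum(cbar, n, v=()):
    """Sum over subtrees T with n vertices rooted at v of
    prod_x 1/(h_x * cbar(x)^(h_x - 1)), with hook lengths taken in T."""
    if n == 0:
        return Fraction(1)  # an absent child contributes the empty product
    # placed[j]: sum over ways to put j vertices below the children seen so far
    placed = [Fraction(1)] + [Fraction(0)] * (n - 1)
    for i in range(cbar(v)):
        child = v + (i,)
        below = [subtree_sum(cbar, k, child) for k in range(n)]
        placed = [
            sum(placed[a] * below[j - a] for a in range(j + 1)) for j in range(n)
        ]
    # The root v of the subtree has hook length n.
    return placed[n - 1] / (n * Fraction(cbar(v)) ** (n - 1))


# Every vertex has two children: this is the binary tree case.
assert subtree_sum(lambda v: 2, 5) == Fraction(1, factorial(5))
# A vertex at depth d has d + 1 children.
assert subtree_sum(lambda v: len(v) + 1, 5) == Fraction(1, factorial(5))
```

The second example uses a tree in which the number of children varies from vertex to vertex, which is the situation the theorem is meant to cover beyond binary trees.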

2 The algorithm 

For any rooted tree $T$, an increasing labeling of $T$ is a bijection

$$\ell : T \to \{1, 2, \ldots, |T|\}$$

such that for any $v \in T$ and any $w \in C_v$ we have $\ell(v) < \ell(w)$. We will often
let $L = L(T)$ stand for an increasing labeling of $T$ viewed as $T$ with the labels
attached to its vertices. Similarly, we will write $L(\mathcal{F})$ for the family of all increasing
labelings of trees in the family $\mathcal{F}$. Let $f^T$ be the number of increasing labelings
$L(T)$ where $T$ has distinguishable vertices. The following hook length formula for
$f^T$, where $n = |T|$, is well known and easy to prove:

$$f^T = \frac{n!}{\prod_{v \in T} h_v}. \qquad (4)$$
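The hook length formula for $f^T$ can be confirmed on a small example by brute force. The sketch below counts increasing labelings by trying every assignment of labels and compares the count with $n!/\prod_v h_v$. The dictionary encoding of the tree and the function names are assumptions made for illustration.

```python
from itertools import permutations
from math import factorial, prod


def hook_lengths(children, root):
    """Return {vertex: hook length} for the tree given as {vertex: [children]}."""
    h = {}

    def size(v):
        h[v] = 1 + sum(size(c) for c in children.get(v, []))
        return h[v]

    size(root)
    return h


def count_increasing_labelings(children, root):
    """Count bijections to {1, ..., n} in which each parent's label is
    smaller than the labels of its children."""
    verts = list(hook_lengths(children, root))
    count = 0
    for perm in permutations(range(1, len(verts) + 1)):
        label = dict(zip(verts, perm))
        if all(label[v] < label[c] for v in verts for c in children.get(v, [])):
            count += 1
    return count


# A tree on 6 vertices with hook lengths 6, 3, 2, 1, 1, 1.
tree = {"r": ["a", "b"], "a": ["c", "d"], "b": ["e"]}
h = hook_lengths(tree, "r")
assert count_increasing_labelings(tree, "r") == factorial(6) // prod(h.values())
```

For this tree both sides equal 20: the root receives label 1, three of the remaining five labels go to the subtree at `a` in $\binom{5}{3} = 10$ ways, and that subtree can then be labeled in 2 ways.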



So if we multiply any of the three identities from the previous section by $n!$, we
obtain a sum of the form

$$\sum_{T \in \mathcal{F}(n)} f^T \pi(T) = 1$$

where $\mathcal{F}$ is a family of trees and $\pi(T)$ is a product. We wish to interpret $\pi(T)$
as the probability of obtaining an increasing labeling $L$ of $T$ for an appropriate
probability distribution on $L(\mathcal{F}(n))$. The identity will then follow since

$$1 = \sum_{L \in L(\mathcal{F}(n))} \mathrm{Prob}(L) = \sum_{T \in \mathcal{F}(n)} \sum_{L = L(T)} \pi(T) = \sum_{T \in \mathcal{F}(n)} f^T \pi(T).$$

Note that $\mathrm{Prob}(L)$ will depend on $T$ where $L = L(T)$, but not on the specific
labeling of $T$.

The probability distribution will be obtained by specifying the parameters in
the following basic algorithm which takes as input a positive integer $n$ and a family
of trees $\mathcal{F}$ and outputs a labeling $L$ of some $T \in \mathcal{F}(n)$.

1. Let $L$ consist of a single root labeled 1.

2. While $|L| < n$, consider all possible leaves $v$ one could add to $L$ and still stay
in $L(\mathcal{F})$. Pick one such leaf, label it $|L| + 1$, and add it to $L$ with probability
$\mathrm{Prob}(v, L)$.

3. Output $L$.

It will be convenient to also use the notation $\mathrm{Prob}(v, L)$ when $v \in L$. In that case,
it should be interpreted as $\mathrm{Prob}(v, L')$ where $L'$ is the subtree of $L$ induced by those
vertices with labels less than $\ell(v)$.
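The three steps above can be sketched in Python as follows. For concreteness the sketch is specialized to binary trees with the step probabilities $1/2^{d_v}$ that the proof of Theorem 1.1 introduces, where $d_v$ is the depth of the new leaf. The encoding of a vertex as a string over `l` and `r` (its path from the root) and the names `binary_candidates` and `grow` are assumptions of this sketch.

```python
import random


def binary_candidates(L):
    """Leaves that can be added to the labeled binary tree L, each paired
    with the probability 1/2^depth used for the family of binary trees."""
    out = []
    for v in L:  # v is the path from the root, e.g. "lr"
        for side in "lr":
            child = v + side
            if child not in L:
                out.append((child, 0.5 ** len(child)))
    return out


def grow(n, candidates, rng=random):
    """Basic algorithm: start from a root labeled 1 and add leaves until
    the tree has n vertices. Returns {vertex: label}."""
    L = {"": 1}                                   # step 1
    while len(L) < n:                             # step 2
        leaves, probs = zip(*candidates(L))
        v = rng.choices(leaves, weights=probs)[0]
        L[v] = len(L) + 1
    return L                                      # step 3


sample = grow(6, binary_candidates, random.Random(0))
# Labels increase from each parent to its child.
assert all(sample[v[:-1]] < sample[v] for v in sample if v)
```

Passing the candidate function as a parameter mirrors the way the basic algorithm is shared by the three families: only the rule producing the leaves and their probabilities changes.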

To finish the proofs, we just need to specify for each of the three families
what the probabilities $\mathrm{Prob}(v, L)$ are, and prove that they describe a probability
distribution such that all increasing labelings $L$ of a given tree are equally likely
with the common value being

$$\mathrm{Prob}(L) = \prod_{v \in L} \mathrm{Prob}(v, L) = \pi(T).$$

Figure 4: A tree in $\mathcal{B}(3)$ and the probabilities of its additional leaves

Proof of (1). Given a tree $T$ rooted at $r$ and $v \in T$ we let $P_v$ be the unique
$r$-$v$ path. The depth of $v$, $d_v$, is the length of $P_v$. In the algorithm, let

$$\mathrm{Prob}(v, L) = \frac{1}{2^{d_v}}.$$

For example, Figure 4 shows one of the trees $T$ in $\mathcal{B}(3)$ along with these probabilities
for each possible leaf $v$ which could be added to $T$. To further distinguish
such leaves from the nodes of $T$, the corresponding edges are dashed.

We first need a lemma which will be used in all three proofs. So we will state 
it in general terms. 

Lemma 2.1. For each of the three families $\mathcal{F}$ under consideration and for each
$L \in L(\mathcal{F})$ we have

$$\sum_{v} \mathrm{Prob}(v, L) = 1 \qquad (5)$$

where the sum is over all leaves $v$ which could be added to $L$.
Proof of the Lemma for $\mathcal{F} = \mathcal{B}$. We induct on $|L|$ where the base case is easy to
do. Given $L$, let $w$ be the leaf of $L$ such that $\ell(w) = |L|$ and let $L' = L - w$. Then
the terms in the sum for $L'$ are the same as those in the sum for $L$ except that the
summand $1/2^{d_w}$ in the former has been replaced by $1/2^{d_w+1} + 1/2^{d_w+1}$. Since these
two expressions are equal, so are the sums, and we are done by induction. □

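Next we need to verify that for $L = L(T)$ we have $\mathrm{Prob}(L) = \pi(T)$, i.e.,

$$\mathrm{Prob}(L) = \prod_{v \in T} \frac{1}{2^{h_v - 1}}.$$

Again, let $\ell(w) = |L|$ and $L' = L - w$, and let $T'$ be the tree underlying $L'$. Then the hook lengths in $L$ are the same as
those in $L'$ except that those of the $d_w$ vertices on the path $P_w - w$ have each increased
by one. Note also that $w$ itself does not contribute to the product above since
$1/2^{h_w - 1} = 1$. So, by induction,

$$\mathrm{Prob}(L) = \mathrm{Prob}(w, L')\,\mathrm{Prob}(L') = \frac{1}{2^{d_w}} \prod_{v' \in T'} \frac{1}{2^{h_{v'} - 1}} = \prod_{v \in T} \frac{1}{2^{h_v - 1}}. \qquad (6)$$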

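It remains to show that $\mathrm{Prob}(L)$ defines a probability distribution. But
using the Lemma and induction as well as our usual notation:

$$\sum_{L \in L(\mathcal{B}(n))} \mathrm{Prob}(L) = \sum_{L \in L(\mathcal{B}(n))} \mathrm{Prob}(w, L')\,\mathrm{Prob}(L') = \sum_{L' \in L(\mathcal{B}(n-1))} \mathrm{Prob}(L') \sum_{w} \mathrm{Prob}(w, L') = \sum_{L' \in L(\mathcal{B}(n-1))} \mathrm{Prob}(L') = 1.$$

This finishes the proof of (1). □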
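The two facts just proved for binary trees, that the probabilities of all outputs sum to 1 and that each labeling $L = L(T)$ has probability $\prod_{v \in T} 1/2^{h_v-1}$, can be checked exhaustively for small $n$. The sketch below runs the algorithm symbolically, tracking every possible output with its exact probability. Vertices are again encoded as strings over `l` and `r`, and the function names are hypothetical.

```python
from fractions import Fraction


def exact_distribution(n):
    """Map each increasing labeling of a binary tree with n vertices
    (stored as the tuple of vertices in label order) to its probability."""
    dist = {("",): Fraction(1)}
    for _ in range(n - 1):
        new = {}
        for L, p in dist.items():
            for v in L:
                for side in "lr":
                    child = v + side
                    if child not in L:
                        # A leaf at depth d is added with probability 1/2^d.
                        new[L + (child,)] = p * Fraction(1, 2 ** len(child))
        dist = new
    return dist


def pi_weight(L):
    """prod over v in T of 1/2^(h_v - 1); the hook of v consists of the
    vertices whose path from the root starts with v."""
    out = Fraction(1)
    for v in L:
        h = sum(1 for u in L if u.startswith(v))
        out /= 2 ** (h - 1)
    return out


dist = exact_distribution(5)
assert sum(dist.values()) == 1
assert all(p == pi_weight(L) for L, p in dist.items())
```

Each labeling is reached by exactly one sequence of choices, because removing the vertex with the largest label recovers the previous stage, so no dictionary key is written twice.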

Note that the proof that $\mathrm{Prob}(L)$ forms a probability distribution only depends
on Lemma 2.1. So in the next two proofs, we will skip this step.

Proof of (2). Note that the left-hand side of (2) is a rational function of $m$,
so clearing denominators gives a polynomial equation. Thus it suffices to prove
that this identity holds for infinitely many values of $m$. We will provide a proof
for all real numbers $m > M$ where $M$ is chosen sufficiently large such that
$0 \le \mathrm{Prob}(v, L) \le 1$ for all $v \in L \in L(\mathcal{O}(n))$. This will be possible because $\mathrm{Prob}(v, L)$
will be asymptotic to, but smaller than or equal to, $1/m^{d_v - 1}$. Specifically, let

$$\mathrm{Prob}(v, L) = \frac{m - c_p}{(c_p + 1)\, m^{d_v}}$$

where $p$ is the parent of $v$. Remember that, according to our convention following
the description of the algorithm, $c_p$ is calculated in the subtree of $L$ induced by
those vertices with labels less than $\ell(v)$. In particular, $c_p$ does not count $v$ itself.
Figure 5 displays a tree of $\mathcal{O}(4)$ and the probabilities of the leaves which can be
added to it.

Figure 5: A tree in $\mathcal{O}(4)$ and the probabilities of its additional leaves

Our first order of business will be to prove Lemma 2.1 in this setting.

Proof of the Lemma for $\mathcal{F} = \mathcal{O}$. As before, we induct on $|L|$, keeping the
notation the same as the first proof. We also let $p$ be the parent of $w$ and write
$p'$ if we are considering $p$ as a vertex of $L'$. So $c_p = c_{p'} + 1$ and the terms in the
sum for $L'$ corresponding to the $c_{p'} + 1$ possible children which could be added to
$p'$ give a total of

$$(c_{p'} + 1) \cdot \frac{m - c_{p'}}{(c_{p'} + 1)\, m^{d_w}} = \frac{m - c_p + 1}{m^{d_w}}.$$

In the sum for $L$, these terms are replaced by $c_p + 1$ summands for children of $p$
and one for a child of $w$, giving

$$(c_p + 1) \cdot \frac{m - c_p}{(c_p + 1)\, m^{d_w}} + \frac{m}{m^{d_w + 1}} = \frac{m - c_p + 1}{m^{d_w}}.$$

Since these two contributions are the same and all other terms in the two sums match
up, we are done. □

Figure 6: A subtree in $\mathcal{T}(3)$ and additional leaves

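We next need to show that $\mathrm{Prob}(L) = \pi(T)$ for $\mathcal{F} = \mathcal{O}$. Keeping our usual
notation we have $\mathrm{Prob}(L)/\mathrm{Prob}(L') = \mathrm{Prob}(w, L')$. So the desired equality will
follow by induction, the reasoning applied to obtain (6), and the computation

$$\frac{w(T)}{w(T')} = \frac{\binom{m}{c_p}}{\binom{m}{c_{p'}}} = \frac{m - c_{p'}}{c_{p'} + 1}.$$

This completes the proof for $\mathcal{O}$. □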
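The argument for ordered trees can be checked in the same exhaustive way. The sketch below assumes an integer value of $m$ for simplicity, uses the step probability $(m - c_p)/((c_p+1)m^{d_v})$ for each of the $c_p + 1$ positions at which a new child of $p$ can be inserted, and compares the resulting probability of each labeling with $w(T)\prod_v 1/m^{h_v-1}$ where $w(T) = \prod_v \binom{m}{c_v}$. The tuple encoding (entry $i$ lists the children of the vertex labeled $i+1$, in order) and all function names are assumptions of the sketch.

```python
from fractions import Fraction
from math import comb


def depths(L):
    """Depth of each vertex; children always have larger indices than parents."""
    d = [0] * len(L)
    for p, kids in enumerate(L):
        for c in kids:
            d[c] = d[p] + 1
    return d


def ordered_distribution(n, m):
    """Map each increasing labeling of an ordered tree with n vertices to
    the exact probability that the algorithm outputs it."""
    dist = {((),): Fraction(1)}  # a single root with no children
    for k in range(1, n):        # k is the index of the vertex being added
        new = {}
        for L, prob in dist.items():
            d = depths(L)
            for p, kids in enumerate(L):
                c = len(kids)
                step = Fraction(m - c, (c + 1) * m ** (d[p] + 1))
                for pos in range(c + 1):  # positions among the children of p
                    kids2 = kids[:pos] + (k,) + kids[pos:]
                    new[L[:p] + (kids2,) + L[p + 1:] + ((),)] = prob * step
        dist = new
    return dist


def pi_ordered(L, m):
    """w(T) * prod_v 1/m^(h_v - 1) with w(T) = prod_v binom(m, c_v)."""
    h = [1] * len(L)
    for p in reversed(range(len(L))):
        h[p] += sum(h[c] for c in L[p])
    out = Fraction(1)
    for p, kids in enumerate(L):
        out *= Fraction(comb(m, len(kids)), m ** (h[p] - 1))
    return out


dist = ordered_distribution(4, 5)
assert sum(dist.values()) == 1                       # Lemma 2.1 at every step
assert all(p == pi_ordered(L, 5) for L, p in dist.items())
```

With exact arithmetic the two checks are algebraic identities in $m$, so they also pass for small integers $m$ where some step probabilities vanish.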
Proof of (3). For this case, we proceed as usual, but letting

$$\mathrm{Prob}(v, L) = \prod_{x \in P_v - v} \frac{1}{\bar{c}_x}.$$

Figure 6 gives an example using a tree from $\mathcal{T}(3)$ where $\bar{T}$ is as in Figure 3.

Proof of the Lemma for $\mathcal{F} = \mathcal{T}$. Now in passing from the sum for $L'$ to the
sum for $L$, a single term $\prod_{x \in P_w - w} 1/\bar{c}_x$ has been replaced by $\bar{c}_w$ terms all equal to
$\prod_{x \in P_w} 1/\bar{c}_x$. Clearly this does not change the sum. □

The proof that $\mathrm{Prob}(L) = \pi(T)$ is just like the one for $\mathcal{B}$ except that the hook
length powers of 2 are replaced by powers of $\bar{c}_x$. So we are done with the case of
$\mathcal{T}$. □



3 An open problem 

As remarked in the introduction, Han actually proved two formulas in [4], both
having hook lengths as exponents. We have been unable to give a probabilistic proof
of the second one. But we will indicate how one might proceed, in the hope that someone
else may be able to push it through.

Call a binary tree complete if every vertex has 0 or 2 children. Given a binary
tree $T$ on $n$ nodes it has completion $\hat{T}$ which is the complete binary tree obtained
from $T$ by adding all $n + 1$ possible leaves. If $T$ is the tree with the solid edges in
Figure 4 then $\hat{T}$ is the tree which also includes the dashed edges. It is not hard to
show using (4) that

$$f^{\hat{T}} = \frac{(2n + 1)!}{\prod_{v \in T} (2h_v + 1)}. \qquad (7)$$

Han's second formula is

$$\sum_{T \in \mathcal{B}(n)} \prod_{v \in T} \frac{1}{(2h_v + 1)\, 2^{2h_v - 1}} = \frac{1}{(2n + 1)!}.$$

Using (7), this can be rewritten as

$$\sum_{T \in \mathcal{B}(n)} f^{\hat{T}} \prod_{v \in T} \frac{1}{2^{2h_v - 1}} = 1.$$

It would be very interesting to find a probability distribution on increasing labelings
of complete trees $\hat{T}$ whose probabilities are given by $\prod_{v \in T} 1/2^{2h_v - 1}$. Once this is
done, similar ideas should prove the generalization to $\mathcal{O}$ due to Yang [9]. It is not
clear how to generalize Han's formula to the $\mathcal{T}$ case, but it would be interesting to
do so if possible.
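Although no probabilistic proof is offered for Han's second formula, the identity itself, $\sum_{T} \prod_{v \in T} 1/((2h_v+1)2^{2h_v-1}) = 1/(2n+1)!$ over binary trees with $n$ vertices, is easy to confirm numerically for small $n$. The sketch below does so by splitting a tree at its root, and the function name `han_second_sum` is hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def han_second_sum(n):
    """Sum over binary trees T with n vertices of
    prod_v 1/((2 h_v + 1) * 2^(2 h_v - 1)); n = 0 is an absent child."""
    if n == 0:
        return Fraction(1)
    # The root has hook length n; k vertices go to the left subtree.
    root_factor = Fraction(1, (2 * n + 1) * 2 ** (2 * n - 1))
    return root_factor * sum(
        han_second_sum(k) * han_second_sum(n - 1 - k) for k in range(n)
    )


for n in range(1, 8):
    assert han_second_sum(n) == Fraction(1, factorial(2 * n + 1))
```

A check of this kind only confirms the target identity. It gives no information about which step probabilities on increasing labelings of complete trees would realize the weights $\prod_{v \in T} 1/2^{2h_v-1}$.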